This directory contains all materials needed to replicate the results in Ying, Montgomery and Stewart's "Topics, Concepts, and Measurement: A Crowdsourced Procedure for Validating Topics as Measures."

We have included two folders. The "0_preparation" folder contains the code to fit topic models, set up the environment for our proposed crowdsourced procedure, and validate the topics and labels via MTurk. The "1_replication" folder contains the data and code to replicate the results in the article and supplementary appendix.

-- For readers who want to replicate the results in the article, the "1_replication" folder is sufficient. Specifically,
(a) 1_topicvalidation/topicvalidation.R: load the topic validation results and produce Figure 3 in the article.
(b) 2_labelvalidation/labelvalidation.R: load the label validation results and produce Figure 4 and Table 4 in the article.
(c) 3_topicvalidation_appendix/topicvalidation_appendix.R: replicate Figure SI7 and Table SI6 in the supplementary appendix.
(d) 4_labelvalidation_appendix/labelvalidation_appendix.R: replicate Figure SI9 and Table SI7 in the supplementary appendix.
(e) 5_moreinfo_appendix/moreinfo_appendix.R: replicate Figures SI25-SI32 in the supplementary appendix.
(f) master_replication.R: call scripts (a)-(e) and replicate all results. 
- Other figures and tables in the main article and the supplementary appendix are only for illustration purposes.
- Please note that we have replaced each unique MTurk worker ID (the worker_id field in the results files) with a randomly generated string that does not reveal the worker's identity.
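
For readers who want to run only some of scripts (a)-(e) rather than master_replication.R, the following is a minimal sketch. It assumes the working directory is the "1_replication" folder and that each script can be sourced from within its own subfolder; neither assumption is confirmed by the archive. run_scripts is a hypothetical helper and is not part of the replication code.

```r
# Hypothetical helper: source a chosen subset of the replication scripts.
# The paths are those listed in (a)-(e); the working-directory handling
# (chdir = TRUE) is an assumption about how the scripts locate their data.
replication_scripts <- c(
  a = "1_topicvalidation/topicvalidation.R",
  b = "2_labelvalidation/labelvalidation.R",
  c = "3_topicvalidation_appendix/topicvalidation_appendix.R",
  d = "4_labelvalidation_appendix/labelvalidation_appendix.R",
  e = "5_moreinfo_appendix/moreinfo_appendix.R"
)

run_scripts <- function(keys = names(replication_scripts), root = ".") {
  ran <- character(0)
  for (key in keys) {
    path <- file.path(root, replication_scripts[[key]])
    if (!file.exists(path)) {
      message("Skipping (not found): ", path)
      next
    }
    # chdir = TRUE temporarily switches to the script's folder while it runs.
    source(path, chdir = TRUE)
    ran <- c(ran, key)
  }
  # Return the keys of the scripts that were actually sourced.
  invisible(ran)
}

# Example: reproduce only Figure 3 and Figure 4 / Table 4.
# run_scripts(c("a", "b"))
```

If a script fails to find its input files, the working-directory assumption is the first thing to check; running master_replication.R directly remains the documented route.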

-- For readers who want to modify our code to conduct their own validation exercises, the "0_preparation" folder provides the example code we used for our article. Specifically, 
(a) 4_corpus/fittopicmodel.R: fit topic models
(b) 3_auxiliary/managequalifications.R: create and manage worker qualifications for the MTurk system
(c) 1_topicvalidation/scripts/WI_stm** OR T8WSI_stm** OR R4WSI_stm**.R: validate topics
(d) 2_labelvalidation/scripts/LI_** OR OL_**.R: validate labels
(e) master_preparation.R: call scripts (c)-(d) and produce all the validation tasks.
- We strongly recommend carefully reading Sections 4 (A Checklist for Crowdsourced Topic Validation) and 5 (Software and Working Example) in our supplementary appendix. Please also note that running the code in "0_preparation" will NOT automatically generate the results in our paper because the identifiers are unique to each MTurk researcher account.
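
Because the scripts in (c) and (d) produce validation tasks, it may help to list which files would be involved before running anything. The sketch below only lists files. It assumes the working directory is the "0_preparation" folder, and the regular expressions are one assumed reading of the ** placeholders above; list_task_scripts is a hypothetical helper.

```r
# Hypothetical helper: list (but do not run) the task scripts in "0_preparation".
# The patterns are an assumed interpretation of the ** placeholders in (c) and (d).
list_task_scripts <- function(root = ".") {
  topic_dir <- file.path(root, "1_topicvalidation", "scripts")
  label_dir <- file.path(root, "2_labelvalidation", "scripts")
  list(
    # Topic validation scripts: names starting with WI_stm, T8WSI_stm or R4WSI_stm.
    topics = list.files(topic_dir, pattern = "^(WI|T8WSI|R4WSI)_stm.*\\.R$"),
    # Label validation scripts: names starting with LI_ or OL_.
    labels = list.files(label_dir, pattern = "^(LI|OL)_.*\\.R$")
  )
}

# Example:
# str(list_task_scripts())
```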

Results were produced by:
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
R Packages: validateIt_0.2.01, stm_1.3.6, wordcloud_2.6, RColorBrewer_1.1-2
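
To compare a local installation against the package versions listed above, something like the following sketch could be used. The expected versions are copied from the list above; compare_versions is a hypothetical helper, and a version mismatch does not necessarily mean the results will differ.

```r
# Package versions as reported above.
expected <- c(
  validateIt   = "0.2.01",
  stm          = "1.3.6",
  wordcloud    = "2.6",
  RColorBrewer = "1.1-2"
)

# Hypothetical helper: report installed versions next to the expected ones.
compare_versions <- function(expected) {
  installed <- vapply(names(expected), function(pkg) {
    # packageVersion() raises an error for packages that are not installed.
    tryCatch(as.character(utils::packageVersion(pkg)),
             error = function(e) NA_character_)
  }, character(1))
  matches <- mapply(function(inst, exp) {
    # Compare as version objects so that "0.2.01" and "0.2.1" count as equal.
    !is.na(inst) && package_version(inst) == package_version(exp)
  }, installed, expected)
  data.frame(
    package   = names(expected),
    expected  = unname(expected),
    installed = unname(installed),
    matches   = unname(matches),
    stringsAsFactors = FALSE
  )
}

print(compare_versions(expected))
print(R.version.string)  # compare with "R version 4.0.3 (2020-10-10)"
```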